Fast Parallel Association Rule Mining without Candidacy Generation

نویسندگان

  • Osmar R. Zaïane
  • Mohammad El-Hajj
  • Paul Lu
چکیده

In this paper we introduce a new parallel algorithm MLFPT (Multiple Local Frequent Pattern Tree) [11] for parallel mining of frequent patterns, based on FP-growth mining, that uses only two full I/O scans of the database, eliminating the need for generating the candidate items, and distributing the work fairly among processors. We have devised partitioning strategies at different stages of the mining process to achieve near optimal balancing between processors. We have successfully tested our algorithm on datasets larger than 50 million transactions.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Parallel Association Rule Mining with Minimum Inter-Processor Communication

Existing parallel association rule mining algorithms suffer from many problems when mining massive transactional datasets. One major problem is that most of the parallel algorithms for a shared nothing environment are Aprioribased algorithms. Apriori-based algorithms are proven to be not scalable due to many reasons, mainly: (1) the repetitive I/O disk scans, (2) the huge computation and commun...

متن کامل

COFI-tree Mining: A New Approach to Pattern Growth with Reduced Candidacy Generation

Existing association rule mining algorithms suffer from many problems when mining massive transactional datasets. Some of these major problems are: (1) the repetitive I/O disk scans, (2) the huge computation involved during the candidacy generation, and (3) the high memory dependency. This paper presents the implementation of our frequent itemset mining algorithm, COFI, which achieves its effic...

متن کامل

Multi-objective Genetic Algorithm for Association Rule Mining Using a Homogeneous Dedicated Cluster of Workstations

This study presents a fast and scalable multi-objective association rule mining technique using genetic algorithm from large database. The objective functions such as confidence factor, comprehensibility and interestingness can be thought of as different objectives of our association rulemining problem and is treated as the basic input to the genetic algorithm. The outcomes of our algorithm are...

متن کامل

Effective Positive Negative Association Rule Mining Using Improved Frequent Pattern Tree

Association Rule is an important tool for today data mining technique. But this work only concern with positive rule generation till now. This paper gives study for generating negative and positive rule generation as demand of modern data mining techniques requirements. Here also gives detail of “A method for generating all positive and negative Association Rules” (PNAR). PNAR help to generates...

متن کامل

Effective Positive Negative Association Rule Mining Using Improved Frequent Pattern

Association Rule is an important tool for today data mining technique. But this work only concern with positive rule generation till now. This paper gives study for generating negative and positive rule generation as demand of modern data mining techniques requirements. Here also gives detail of “A method for generating all positive and negative Association Rules” (PNAR). PNAR help to generates...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001